Automated Extraction and Retrieval of Metadata by Data Mining - A Case Study of Mining Engine for National Land Survey Sweden
Abstract
Metadata describes geographical data resources and their key elements, and it is used to guarantee the availability and accessibility of the data. ISO 19115 is a metadata standard for geographical information that makes geographical metadata shareable, retrievable, and understandable at the global level. Because geographical data is massive, high-dimensional, and highly diverse, data mining is a suitable method for discovering its metadata. This thesis develops and evaluates an automated mining method for extracting metadata from the data environment on the Local Area Network at the National Land Survey of Sweden (NLS). These metadata are prepared and provided across Europe according to the metadata implementing rules of the Infrastructure for Spatial Information in Europe (INSPIRE). The metadata elements are defined according to the numerical formats of four different data entities: document data, time-series data, webpage data, and spatial data. To evaluate the method and guide further improvement, a few attributes and the corresponding metadata of geographical data files are extracted automatically as metadata records during testing and stored in a database. Based on the extracted metadata schema, a retrieval function finds the files that contain the metadata keyword entered by the user. Overall, the average success rate of metadata extraction and retrieval is 90.0%. The mining engine is developed in the C# programming language on top of a SQL Server 2005 database. Lucene.net is also integrated with Visual Studio 2005 to build an indexing framework for extracting and accessing metadata in the database.
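The extract-then-retrieve flow described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the original engine uses C#, SQL Server 2005, and Lucene.net, whereas this example uses Python with an in-memory inverted index as a stand-in. The record fields (`title`, `format`, `size_bytes`, `modified`) and the function names are hypothetical and are not taken from ISO 19115 or the INSPIRE rules.

```python
import os
import re
from collections import defaultdict
from datetime import datetime, timezone


def extract_record(path):
    """Build one metadata record from basic file-system attributes.

    The field names are illustrative assumptions, not ISO 19115 elements.
    """
    stat = os.stat(path)
    stem, ext = os.path.splitext(os.path.basename(path))
    return {
        "path": path,
        "title": stem,
        "format": ext.lstrip(".").lower(),
        "size_bytes": stat.st_size,
        "modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each token in the title and format fields to record positions."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for field in ("title", "format"):
            for token in tokenize(record[field]):
                index[token].add(position)
    return index


def search(index, records, keyword):
    """Return paths of records whose indexed fields contain every token."""
    tokens = tokenize(keyword)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return [records[i]["path"] for i in sorted(matches)]
```

Calling `extract_record` on each file in a directory, passing the records to `build_index`, and then calling `search(index, records, "roads")` would return the paths of files whose name or extension contains the token `roads`. A full-text library such as Lucene adds analysis, ranking, and persistent storage that this sketch leaves out.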
Similar resources
A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company
Based on the findings of Massachusetts Institute of Technology, organizations’ data double every five years. However, the rate of using data is 0.3. Nowadays, data mining tools have greatly facilitated the process of knowledge extraction from a welter of data. This paper presents a hybrid model using data gathered from an ATM manufacturing company. The steps of the research are based on CRISP-D...
Automatic Acquisition of Similarity between Entities by Using Web Search Engine
Web mining is the application of data mining technology to discover patterns from the web. It covers various web tasks such as relation extraction, community mining, document clustering and automatic metadata extraction. Previously proposed web-based semantic similarity measures show high correlation with human ratings on three benchmark datasets. One of the main problems in information retrie...
Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)
Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems in order to manage and minimize the inherent risks of the financial sector, thus, the design and improvement of credit scorin...
Geospatial Data Mining on the Web: Discovering Locations of Emergency Service Facilities
Identifying location-based information from the WWW, such as street addresses of emergency service facilities, has become increasingly popular. However, current Web-mining tools such as Google’s crawler are designed to index webpages on the Internet instead of considering location information with a smaller granularity as an indexable object. This always leads to low recall of the search result...
Multimedia Annotation System for Intelligent Search*
In this paper we present an overview of the intelligent multimedia annotation and search system MetaOn. The core objective is to construct and integrate semantically rich metadata, extracted from documents and images, to facilitate intelligent search and analysis. The proposed MetaOn framework involves ontology-based information extraction and data mining, semi-automatic construction of domain...
Publication date: 2010